Music Remixing and Upmixing Using Source Separation
Abstract
Current research on audio source separation provides tools to estimate the signals contributed by different instruments in polyphonic music mixtures. Such tools can already be incorporated in music production and post-production workflows. In this paper, we describe recent experiments where audio source separation is applied to remixing and upmixing existing mono and stereo music content.
1. AUDIO SOURCE SEPARATION USING DEEP NEURAL NETWORKS
Audio source separation algorithms have progressed a long way in recent years, moving on to algorithms that exploit prior information in order to estimate time-frequency masks [1]. For example, deep neural networks (DNNs) are used in a supervised setting that strongly depends on available training data. In exchange, supervised training frees them from assumptions needed in other algorithms, such as having recordings from multiple microphones or dealing with repetitive music structures. DNNs are trained to estimate time-frequency masks, which still rely on the assumption that energy from different sound sources does not overlap in the time-frequency plane. While applying hard (binary) masks to spectrograms achieves good separation, it introduces many noticeable artifacts. Soft masks produce better-sounding results, but imperfect separation. Results from soft masks can still be recombined in remixing and upmixing applications.
In this paper we describe two recent prototypes that allow repurposing of musical audio using popular instrument classes. While perceptual evaluation is still pending, both can be used to provide convincing results.
2. REPURPOSING MUSICAL AUDIO
The general idea is to use time-frequency masks estimated from DNN models [2] to upmix and remix musical audio. This means that we are able to make audio content interactive by providing the user with controls for remixing or upmixing, not unlike using an intelligent equalizer that knows about the instrument sounds in the mixture.
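The masking and recombination idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`soft_masks`, `binary_masks`, `remix`, `upmix_to_stereo`) are hypothetical, the source magnitudes are made-up toy numbers standing in for DNN estimates, and the constant-power panning used for upmixing is an assumption, since the text does not say how sources are placed in the stereo field.

```python
import numpy as np

def soft_masks(source_mags, eps=1e-10):
    """Ratio (soft) masks: each source's share of the summed magnitude per bin."""
    total = source_mags.sum(axis=0, keepdims=True) + eps
    return source_mags / total

def binary_masks(source_mags):
    """Hard masks: every time-frequency bin goes entirely to the dominant source."""
    winner = source_mags.argmax(axis=0)                      # shape (F, T)
    source_ids = np.arange(source_mags.shape[0])[:, None, None]
    return (source_ids == winner[None]).astype(float)        # shape (S, F, T)

def remix(mix_spec, masks, gains):
    """Scale each masked copy of the mixture by a user gain and sum the results."""
    gains = np.asarray(gains, dtype=float)[:, None, None]
    return (gains * masks * mix_spec[None]).sum(axis=0)

def upmix_to_stereo(mix_spec, masks, pans):
    """Assumed constant-power panning; pans lie in [-1, 1] (left to right)."""
    theta = (np.asarray(pans, dtype=float) + 1.0) * np.pi / 4.0
    left = remix(mix_spec, masks, np.cos(theta))
    right = remix(mix_spec, masks, np.sin(theta))
    return np.stack([left, right])

# Toy data: 2 sources, 3 frequency bins, 2 frames (stand-ins for DNN estimates).
mags = np.array([
    [[3.0, 0.0], [1.0, 2.0], [0.0, 1.0]],   # e.g. "vocals"
    [[1.0, 2.0], [1.0, 0.0], [4.0, 1.0]],   # e.g. "accompaniment"
])
mix = mags.sum(axis=0).astype(complex)       # pretend mono mixture spectrogram

soft = soft_masks(mags)
hard = binary_masks(mags)
unchanged = remix(mix, soft, gains=[1.0, 1.0])       # unit gains return the mixture
louder_vocals = remix(mix, soft, gains=[2.0, 1.0])   # boost the first source
stereo = upmix_to_stereo(mix, soft, pans=[-1.0, 1.0])
```

In a real system the spectrogram would come from an STFT of the recording and the result would be inverted back to audio; both steps are omitted here. Because soft masks sum to one in every bin, imperfectly separated sources still add back to the original mixture at unit gain, which is why they remain usable for remixing.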
Our prototypes use models trained on the dataset from the SiSEC MUS challenge [3], where sources have been consistently annotated according to common popular music instrument categories (vocals, bass, ...
[Figure: block diagram; recoverable labels: STFT, Mix, DNN]
Similar resources
Remixing musical audio on the web using source separation
Research in audio source separation has progressed a long way, producing systems that are able to approximate the component signals of sound mixtures. In recent years, many efforts have focused on learning time-frequency masks that can be used to filter a monophonic signal in the frequency domain. Using current web audio technologies, time-frequency masking can be implemented in a web browser i...
Remixing Stereo Music with Score-Informed Source Separation
Musicians and recording engineers are often interested in manipulating and processing individual instrumental parts within an existing recording to create a remix of the recording. When individual source tracks for a stereo mixture are unavailable, remixing is typically difficult or impossible, since one cannot isolate the individual parts. We describe a method of informed source separation tha...
Informed Audio Source Separation from Compressed Linear Stereo Mixtures
In this paper, new developments concerning a system for informed source separation (ISS) of music signals are presented. Such system enables to separate I > 2 musical instruments and singing voices from linear instantaneous stationary stereo (2-channel) mixtures, based on audio signal natural sparsity, pre-mix source signal analysis, and side-information embedding (within the mix signal). The f...
Score-Informed Sparseness for Source Separation
Audio source separation is a useful preprocessing step for remixing or transcription of music. It can be shown, that the separation quality increases, if the separation algorithm gets additional side information, e.g. the score of the current mixture [5]. In many cases the score of a musical piece is not available and has to be extracted by a professional musician or an automatic music transcri...
New Sonorities for Jazz Recordings: Separation and Mixing Using Deep Neural Networks
The audio mixing process is an art that has proven to be extremely hard to model: What makes a certain mix better than another one? How can the mixing processing chain be automatically optimized to obtain better results in a more efficient manner? Over the last years, the scientific community has exploited methods from signal processing, music information retrieval, machine learning, and more r...
Publication date: 2016